Abstract

Based on criteria of mathematical simplicity and consistency with 
empirical market data, a stochastic volatility model is constructed, 
the volatility process being driven by fractional noise. Price return 
statistics and asymptotic behavior are derived from the model and 
compared with data. Deviations from Black-Scholes and a new option 
pricing formula are also obtained. 

Keywords: Fractional noise, Induced volatility, Statistics of returns, Option pricing

1 Introduction 

Classical Mathematical Finance has, for a long time, been based on the 
assumption that the price process of market securities may be approximated 
by geometric Brownian motion 

$$dS_t = \mu S_t\,dt + \sigma S_t\,dB(t) \tag{1}$$

In liquid markets the autocorrelation of price changes decays to negligible
values in a few minutes, consistent with the absence of long term statistical
arbitrage. Geometric Brownian motion models this lack of memory, although
it does not reproduce the empirical leptokurtosis. On the other hand, nonlinear
functions of the returns exhibit significant positive autocorrelation.
For example, there is volatility clustering, with large returns expected to be
followed by large returns and small returns by small returns (of either sign).
This, together with the fact that autocorrelations of volatility measures decline
very slowly [1] [2] [5], has the clear implication that long memory effects
should somehow be represented in the process and this is not included in the
geometric Brownian motion hypothesis.

*Centro de Matemática e Aplicações Fundamentais and Universidade Técnica de Lisboa, e-mail: vilela@cii.fc.ul.pt

†Centro de Matemática e Aplicações Fundamentais and Universidade Aberta, e-mail: oliveira@cii.fc.ul.pt

On the other hand, as pointed out by Engle [4], when the future is uncertain
investors are less likely to invest. Therefore uncertainty (volatility) would
have to be changing over time. The conclusion is that a dynamical model for
volatility is needed and $\sigma$ in Eq. (1), rather than being a constant, becomes a
process by itself. This idea led to many deterministic and stochastic models
for the volatility ([5] [6] and references therein).

Using, at each step, the criteria of both mathematical simplicity and consistency
with market data, a stochastic volatility model is constructed here, with
volatility driven by fractional noise. It appears to be the minimal model consistent
both with mathematical simplicity and the market data. It turns out
that this data-inspired model is different from the many stochastic volatility 
models that have been proposed in the literature. The model will be used to 
compute the price return statistics and asymptotic behavior, which are com- 
pared with actual data. Deviations from the classical Black-Scholes result 
and a new option pricing formula are also obtained. 

2 The induced volatility process 

The basic hypotheses for the model construction are:

(H1) The log-price process $\log S_t$ belongs to a probability product space
$\Omega \otimes \Omega'$ of which the first one, $\Omega$, is the Wiener space and the second, $\Omega'$, is a
probability space to be characterized later on. Denote by $\omega \in \Omega$ and $\omega' \in \Omega'$
the elements (sample paths) in $\Omega$ and $\Omega'$ and by $\mathcal{F}_t$ and $\mathcal{F}'_t$ the $\sigma$-algebras in
$\Omega$ and $\Omega'$ generated by the processes up to $t$. Then, a particular realization
of the log-price process is denoted

$$\log S_t(\omega, \omega')$$

This first hypothesis is really not limitative. Even if none of the non-trivial
stochastic features of the log-price were to be captured by Brownian motion,
that would simply mean that $S_t$ is a trivial function in $\Omega$.

(H2) The second hypothesis is stronger, although natural. We will assume
that for each fixed $\omega'$, $\log S_t(\cdot, \omega')$ is a square integrable random variable in
$\Omega$.



From the second hypothesis it follows that, for each fixed $\omega'$,

$$\frac{dS_t}{S_t}(\cdot,\omega') = \mu_t(\cdot,\omega')\,dt + \sigma_t(\cdot,\omega')\,dB(t) \tag{2}$$

where $\mu_t(\cdot,\omega')$ and $\sigma_t(\cdot,\omega')$ are well-defined processes in $\Omega$ (Theorem 1.1.3
in Ref. [7]).

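Recall that if $\{X_t, \mathcal{F}_t\}$ is a process such that

$$dX_t = \mu_t\,dt + \sigma_t\,dB(t) \tag{3}$$

with $\mu_t$ and $\sigma_t$ being adapted processes, then

$$\mu_t = \lim_{\varepsilon\to 0}\frac{1}{\varepsilon}E\{X_{t+\varepsilon}-X_t \mid \mathcal{F}_t\}, \qquad \sigma_t^2 = \lim_{\varepsilon\to 0}\frac{1}{\varepsilon}E\{(X_{t+\varepsilon}-X_t)^2 \mid \mathcal{F}_t\} \tag{4}$$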



The process associated to the probability space $\Omega'$ is now to be inferred
from the data. According to (4), for each fixed $\omega'$ realization in $\Omega'$ one has

$$\sigma_t^2(\cdot,\omega') = \lim_{\varepsilon\to 0}\frac{1}{\varepsilon}\left\{E\left(\log S_{t+\varepsilon}-\log S_t\right)^2\right\} \tag{5}$$

Each set of market data corresponds to a particular realization $\omega'$. Therefore,
assuming the realization to be typical, the $\sigma_t^2$ process may be reconstructed
from the data by the use of (5). We call this data-reconstructed $\sigma_t$ process
the induced volatility.

For practical purposes we cannot strictly use Eq. (5) to reconstruct the
induced volatility process, because when the time interval $\varepsilon$ is very small the
empirical evaluation of the variance becomes unreliable. Instead, we estimate
$\sigma_t$ from

$$\sigma_t^2 = \frac{1}{|T_0-T_1|}\,\mathrm{var}(\log S_t) \tag{6}$$

with a time window $|T_0-T_1|$ sufficiently small to give a reasonably local
characterization of the volatility, but also sufficiently large to allow for a
reliable estimate of the local variance of $\log S_t$.
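
The windowed estimate of Eq. (6) can be sketched as follows. This is only an illustration under stated assumptions: the function name `induced_volatility` is hypothetical, the window length $|T_0-T_1|$ is taken to be the number of samples in the window, and the population variance is used; the paper does not specify these details.

```python
import numpy as np

def induced_volatility(log_price, window=5):
    """Rolling estimate of sigma_t in the form of Eq. (6).

    Assumptions (not from the paper): |T0 - T1| is the number of samples
    in the window and the variance is the population variance.
    """
    x = np.asarray(log_price, dtype=float)
    if window < 2 or window > x.size:
        raise ValueError("window must satisfy 2 <= window <= len(log_price)")
    n_windows = x.size - window + 1
    sigma2 = np.empty(n_windows)
    for i in range(n_windows):
        # variance of log S_t inside the window, divided by the window length
        sigma2[i] = np.var(x[i:i + window]) / window
    return np.sqrt(sigma2)
```

Each output value corresponds to one position of the sliding window, so a series of `n` log-prices yields `n - window + 1` volatility values.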

As an example, daily data has been used with time windows of 5 to 9 days.
The upper left panel of Fig. 1 shows the result of applying (6) to the New
York Stock Exchange (NYSE) aggregate index in the period 1966-2000, with
a time window $|T_0-T_1| = 5$ days. Notice that, to discount trend effects and
approach asymptotic stationarity of the process, before application of (6)
the data has been detrended and rescaled as explained in Ref. [8]. Namely, a
polynomial fit is performed for increasing orders until the fitted polynomial is
no longer well conditioned. This seems to be a reasonable detrending method
insofar as it leads to an asymptotically stationary signal [5].

Then, as a first step towards finding a mathematical characterization of
the induced volatility process one looks for scaling properties. Namely one
checks whether a relation of the form

$$E\,|\sigma(t+\Delta)-\sigma(t)| \sim \Delta^H \tag{7}$$

or

$$E\left|\frac{\sigma(t+\Delta)-\sigma(t)}{\sigma(t)}\right| \sim \Delta^H \tag{8}$$

holds for the induced volatility process. This would be the behavior implied
by most stochastic volatility models proposed in the past. It turns out that
the data shows this to be a very bad hypothesis, meaning that the induced
volatility process itself is not self-similar.

Instead, using a standard technique to detect long-range dependencies [9],
one computes the empirical integrated log-volatility and finds that it is well
represented by a relation of the form

$$\sum_{n=0}^{t/\delta}\log\sigma(n\delta) = \beta t + R_\sigma(t) \tag{9}$$

where, as shown in the lower right panel of Fig. 1, the $R_\sigma(t)$ process has very
accurate self-similar properties ($\delta = 1$ day for daily data).
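
A minimal sketch of this test is given below. It assumes that $\beta$ is obtained by a least-squares fit through the origin and that self-similarity is probed through the scaling of the mean absolute increment with the lag; both choices, and the function names, are illustrative assumptions rather than the procedure of Ref. [9].

```python
import numpy as np

def integrated_log_vol_residual(sigma, delta=1.0):
    """Return (beta, R) with cumsum(log sigma) = beta * t + R(t), as in Eq. (9)."""
    y = np.cumsum(np.log(np.asarray(sigma, dtype=float)))
    t = delta * np.arange(1, y.size + 1)
    beta = np.dot(t, y) / np.dot(t, t)   # assumed: least-squares slope through the origin
    return beta, y - beta * t

def scaling_exponent(path, lags=(1, 2, 4, 8, 16, 32)):
    """Slope of log E|X(t + lag) - X(t)| against log(lag)."""
    x = np.asarray(path, dtype=float)
    mean_abs = [np.mean(np.abs(x[lag:] - x[:-lag])) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(mean_abs), 1)
    return slope
```

Applied to the residual `R` returned by the first function, a straight line in the log-log plot with slope $H$ is the signature of self-similarity; an ordinary random walk gives a slope close to 0.5.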
Figure 1: Self-similar properties of the integrated log-volatility $\beta t + R_\sigma(t)$ process

This suggests the following mathematical identification:
(a) Recall that if a nondegenerate process $X_t$ has finite variance, stationary
increments and is self-similar

$$\mathrm{Law}(X_{at}) = \mathrm{Law}(a^H X_t) \tag{10}$$

then [10] $0 < H < 1$ and

$$\mathrm{Cov}(X_s, X_t) = \frac{1}{2}\left(|s|^{2H}+|t|^{2H}-|s-t|^{2H}\right)E(X_1^2) \tag{11}$$

The simplest process with these properties is a Gaussian process called fractional
Brownian motion. Fractional Brownian motion [11]

$$E[B_H(t)B_H(s)] = \frac{1}{2}\left\{|t|^{2H}+|s|^{2H}-|t-s|^{2H}\right\} \tag{12}$$

has, for $H > \frac{1}{2}$, a long range dependence

$$\sum_{n=1}^{\infty}\mathrm{Cov}\left(B_H(1), B_H(n+1)-B_H(n)\right) = \infty \tag{13}$$

(b) Therefore, mathematical simplicity suggests the identification of the
$R_\sigma(t)$ process with fractional Brownian motion.

$$R_\sigma(t) = kB_H(t) \tag{14}$$

From the data one obtains the Hurst coefficient $H \approx 0.8$ (for the NYSE
index). The same parametrization holds for the data of all individual companies
that were tested, with $H$ in the range 0.8 to 0.9.

For comparison the plot in the lower left panel of Fig. 1 shows the scaling
test for the (NYSE) price process where, unlike the $R_\sigma(t)$ process, clear
deviations are seen on the first few days.
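
The covariance structure of fractional Brownian motion can be explored numerically with the short sketch below. The function names are hypothetical, and sampling by Cholesky factorisation is just one convenient choice for small grids, not a method taken from the paper.

```python
import numpy as np

def fbm_covariance(t, s, hurst):
    """E[B_H(t) B_H(s)] = (|t|^2H + |s|^2H - |t - s|^2H) / 2."""
    h2 = 2.0 * hurst
    return 0.5 * (abs(t) ** h2 + abs(s) ** h2 - abs(t - s) ** h2)

def unit_increment_covariance(n, hurst):
    """Cov(B_H(1), B_H(n+1) - B_H(n)), the summand of the long-range sum."""
    return fbm_covariance(1.0, n + 1.0, hurst) - fbm_covariance(1.0, float(n), hurst)

def sample_fbm(times, hurst, rng):
    """Draw B_H on a grid of strictly positive times via Cholesky factorisation."""
    times = np.asarray(times, dtype=float)
    cov = np.array([[fbm_covariance(t, s, hurst) for s in times] for t in times])
    return np.linalg.cholesky(cov) @ rng.standard_normal(times.size)
```

For $H = 1/2$ the increment covariances vanish for $n \ge 1$, while for $H > 1/2$ they are positive and decay so slowly that their partial sums keep growing, which is the long range dependence referred to in the text.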

From (9) and the identification (14) one concludes that the induced
volatility may be modeled by

$$\log\sigma_t = \beta + \frac{k}{\delta}\left\{B_H(t)-B_H(t-\delta)\right\} \tag{15}$$

$\delta$ being the observation time scale (one day, for daily data). It means that
the volatility is not driven by fractional Brownian motion but by fractional
noise. For the volatility (at resolution $\delta$)

$$\sigma(t) = \theta\,e^{\frac{k}{\delta}\{B_H(t)-B_H(t-\delta)\}-\frac{1}{2}\left(\frac{k}{\delta}\right)^2\delta^{2H}} \tag{16}$$

the term $-\frac{1}{2}\left(\frac{k}{\delta}\right)^2\delta^{2H}$ being included to ensure that $E(\sigma(t)) = \theta$.
Eqs. (2) and (15) define a stochastic volatility model.

$$\begin{aligned} dS_t &= \mu S_t\,dt + \sigma_t S_t\,dB(t) \\ \log\sigma_t &= \beta + \frac{k}{\delta}\left\{B_H(t)-B_H(t-\delta)\right\} \end{aligned} \tag{17}$$

In this coupled stochastic system, in addition to a mean value, volatility is
driven by fractional noise. Notice that this empirically based model is different
from the usual stochastic volatility models which assume the volatility to
follow an arithmetic or geometric Brownian process. Also in the Comte and
Renault model [12], it is fractional Brownian motion that drives the volatility,
not its derivative (fractional noise). $\delta$ is the observation scale of the process.
In the $\delta \to 0$ limit the driving process would be the distribution-valued
process $W_H$

$$W_H = \lim_{\delta\to 0}\frac{1}{\delta}\left(B_H(t)-B_H(t-\delta)\right) \tag{18}$$
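
A possible way to simulate the coupled system at resolution $\delta = 1$ is sketched below. The discretisation is an assumption made for illustration: the volatility is held constant over each unit step, the log-price is advanced with the exact log-normal step for that volatility, and the price noise is drawn independently of the fractional noise. Function names are hypothetical.

```python
import numpy as np

def fractional_noise(n, hurst, rng):
    """n unit-lag increments B_H(j) - B_H(j-1), sampled via Cholesky."""
    lag = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    h2 = 2.0 * hurst
    cov = 0.5 * ((lag + 1) ** h2 + np.abs(lag - 1) ** h2 - 2 * lag ** h2)
    return np.linalg.cholesky(cov) @ rng.standard_normal(n)

def simulate_model(n, mu, beta, k, hurst, rng, s0=1.0):
    """Discretised sketch of the price/volatility system with delta = 1."""
    # log sigma_t = beta + k * (B_H(t) - B_H(t - 1))
    sigma = np.exp(beta + k * fractional_noise(n, hurst, rng))
    z = rng.standard_normal(n)                  # Brownian increments over dt = 1
    log_returns = (mu - 0.5 * sigma ** 2) + sigma * z
    prices = s0 * np.exp(np.concatenate(([0.0], np.cumsum(log_returns))))
    return prices, sigma
```

For example, `simulate_model(500, 0.0, -5.0, 0.59, 0.83, np.random.default_rng(1))` uses the parameter values quoted later in the text for the NYSE index and returns 501 prices together with the 500 volatilities that generated them.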

In (17) the constant $k$ measures the strength of the volatility randomness.
Although phenomenologically grounded and mathematically well specified,
the stochastic system (17) is still a limited model because, in particular, the
fact that the volatility is not correlated with the price process excludes the
modeling of leverage effects. It would be simple to introduce, by hand, such
a correlation in the second equation in (17). However we prefer not to do
so at this time, because we have not yet found a natural way to do it that is
as clear-cut and imposed by the data as the approach that led to (17).



3 The statistics of price returns 

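Here one computes the probability distribution of price returns implied by
the stochastic volatility model (17). From (15) one concludes that $\log\sigma_t$ is a
Gaussian process with mean $\beta$ and covariance

$$\psi(s,u) = \frac{k^2}{2\delta^2}\left\{|s-u+\delta|^{2H}+|u-s+\delta|^{2H}-2|s-u|^{2H}\right\} \tag{19}$$

This Gaussian process has non-trivial correlation for $H \neq \frac{1}{2}$. At each fixed
time $\log\sigma_t$ is a Gaussian random variable with mean $\beta$ and variance $k^2\delta^{2H-2}$.
Then,

$$p_\delta(\sigma) = \frac{1}{\sigma}\,p_\delta(\log\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma k\delta^{H-1}}\exp\left\{-\frac{(\log\sigma-\beta)^2}{2k^2\delta^{2H-2}}\right\} \tag{20}$$

therefore

$$P_\delta\left(\log\frac{S_T}{S_t}\right) = \int_0^\infty d\sigma\,p_\delta(\sigma)\,p_\sigma\left(\log\frac{S_T}{S_t}\right) \tag{21}$$

with

$$p_\sigma\left(\log\frac{S_T}{S_t}\right) = \frac{1}{\sqrt{2\pi\sigma^2(T-t)}}\exp\left\{-\frac{\left(\log\frac{S_T}{S_t}-\left(\mu-\frac{\sigma^2}{2}\right)(T-t)\right)^2}{2\sigma^2(T-t)}\right\} \tag{22}$$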

One sees that the effective probability distribution of the returns might depend
both on the time lag $\Delta = T - t$ and on the observation time scale $\delta$
used to construct the volatility process. That this latter dependence might
actually be very weak seems to be implied by some surprising experimental
results.
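
The mixture of Gaussian return densities over a log-normally distributed volatility can be evaluated numerically, for instance as in the sketch below. It assumes that $\log\sigma$ is Gaussian with mean $\beta$ and standard deviation $k\delta^{H-1}$ and uses Gauss-Hermite quadrature for the average over $\log\sigma$; the function name and the quadrature choice are illustrative assumptions.

```python
import numpy as np

def return_density(r, lag, mu, beta, k, hurst, delta=1.0, nodes=64):
    """Density of the log-return r over a time lag, averaged over log(sigma)."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    s = k * delta ** (hurst - 1.0)                 # std of log(sigma)
    x, w = np.polynomial.hermite.hermgauss(nodes)  # nodes/weights for exp(-x^2)
    sigma = np.exp(beta + np.sqrt(2.0) * s * x)    # volatility at each node
    var = sigma ** 2 * lag
    mean = (mu - 0.5 * sigma ** 2) * lag
    gauss = np.exp(-(r[:, None] - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
    return gauss @ w / np.sqrt(np.pi)
```

With `k = 0` the function reduces to a single Gaussian, which gives a simple consistency check; with `k > 0` the mixture has heavier tails than a Gaussian of the same variance.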

Before obtaining a closed form expression for $P_\delta\left(\log\frac{S_T}{S_t}\right)$ and its asymptotic
behavior, we will present some comparisons with market data. For
Fig. 2 the same NYSE one-day data as before is used to fix the parameters
of the volatility process. Then, using $H = 0.83$, $k = 0.59$, $\beta = -5$, $\delta = 1$,
the one-day return distribution predicted by the model is compared with
the data. The agreement is quite reasonable. For comparison a log-normal
with the same mean and variance is also plotted in Fig. 2. Then, in Fig. 3,
using the same parameters, the same comparison is made for the $\Delta = 1$ and
$\Delta = 10$ data.

Figure 2: One-day NYSE returns compared with the model predictions and
the lognormal

Fig. 4 shows a somewhat surprising result. Using the same parameters
and just changing $\Delta$ from $\Delta = 1$ (one day) to the value corresponding to one minute,
the prediction of the model is compared with one-minute data of the US Dollar-Euro market for
a couple of months in 2001. The result is surprising, because one would not
expect the volatility parametrization to carry over to such a different time
scale and also because one is dealing with different markets. A systematic
analysis of high-frequency data is now being carried out to test the degree of
time-scale dependence of the volatility parametrization and its universality
over different markets.

In Fig. 5 and Fig. 6 we have displayed the same one-day and one-minute
return data discussed before as well as the predictions of the model both in
semilogarithmic and loglog plots.

Figure 3: One and ten-days NYSE returns compared with the model predictions

Now we will establish a closed-form expression for the returns distribution
and its asymptotic behavior. Using (20) and (22) in (21) and changing
variables one obtains

$$P_\delta(r(\Delta)) = \frac{1}{4\pi\theta k\delta^{H-1}\sqrt{\Delta}}\int_0^\infty dx\,x^{-\frac{1}{2}}\exp\left\{-\frac{(\log x)^2}{8k^2\delta^{2H-2}}\right\}e^{-\lambda x} \tag{23}$$

with

$$r(\Delta) = \log S_T-\log S_t, \quad \theta = e^\beta, \quad \Delta = T-t, \quad \lambda = \frac{(r(\Delta)-r_0)^2}{2\Delta\theta^2} \tag{24}$$

and

$$r_0 = \left(\mu-\frac{\sigma^2}{2}\right)\Delta \tag{25}$$

Figure 4: One-minute USD-Euro returns compared with the model predictions,
with parameters obtained from one-day NYSE data

Expanding the exponential in (23)

n=0 ■ ^ /JO 

- -^) (26) 

n=0 ^ ^ 

Finally 

with asymptotic behavior, for large returns 

P5(r(A)) ~ -^e-^'°s'^ (28) 
Figure 5: Semilogarithmic and loglog plots of NYSE data

On the other hand, as seen from Figs. 5 and 6, the exact result (23)
or (27) resembles the double exponential distribution recognized by Silva,
Prange and Yakovenko [13] as a new stylized fact in market data. The double
exponential distribution has been shown, by Dragulescu and Yakovenko [14],
to follow from Heston's [15] stochastic volatility model. Notice however that
our model is different from Heston's model in that volatility is driven by a
process with memory (fractional noise). As a result, despite the qualitative
similarity of behavior at intermediate return ranges, the analytic form of the
distribution and the asymptotic behavior are different.



4 Option pricing 



Assuming risk neutrality [16], the value $V(S_t,\sigma_t,t)$ of an option is the present
value of the expected terminal value discounted at the risk-free rate

$$V(S_t,\sigma_t,t) = e^{-r(T-t)}\int V(S_T,\sigma_T,T)\,p(S_T|S_t,\sigma_t)\,dS_T \tag{29}$$

$V(S_T,\sigma_T,T) = \max[0, S_T-K]$ and the conditional probability for the terminal
price depends on $S_t$ and $\sigma_t$. $K$ is the strike price, $T$ the maturity time
and $S_t$ and $\sigma_t$ the price and volatility of the underlying security.

Figure 6: Semilogarithmic and loglog plots of USD-Euro data

Whenever the drift of a financial time series can be replaced by the risk-free
rate we are in a risk-neutral situation. In stochastic volatility models
(with or without fractional noise) this is not an accurate assumption. Nevertheless
we will make use of (29) to obtain an approximate estimate of the
deviations from Black-Scholes implied by the stochastic differential model
(17). As in Hull and White [17], we make use of the relation between conditional
probabilities of related variables, namely

$$p(S_T|S_t,\sigma_t) = \int p\left(S_T|S_t,\overline{\log\sigma}\right)p\left(\overline{\log\sigma}\,|\log\sigma_t\right)d\left(\overline{\log\sigma}\right) \tag{30}$$

$\overline{\log\sigma}$ being the random variable

$$\overline{\log\sigma} = \frac{1}{T-t}\int_t^T\log\sigma_s\,ds \tag{31}$$

that is, $\overline{\log\sigma}$ is the mean log-volatility from time $t$ to the maturity time $T$
conditioned to an average value $\log\sigma_t$ at time $t$. Then Eq. (29) becomes

$$V(S_t,\sigma_t,t) = \int C\left(S_t,e^{\overline{\log\sigma}},t\right)p\left(\overline{\log\sigma}\,|\log\sigma_t\right)d\left(\overline{\log\sigma}\right) \tag{32}$$

$$C\left(S_t,e^{\overline{\log\sigma}},t\right) = \int e^{-r(T-t)}\,V(S_T,\sigma_T,T)\,p\left(S_T|S_t,\overline{\log\sigma}\right)dS_T \tag{33}$$

$C\left(S_t,e^{\overline{\log\sigma}},t\right)$ being the Black-Scholes price for an option with average
volatility $e^{\overline{\log\sigma}}$, which is known to be [18] [19]

$$C(S_t,\sigma,t) = S_t(a+b)N(a,b) - Ke^{-r(T-t)}(a-b)N(a,-b) \tag{34}$$

with

$$a = \frac{1}{\sigma}\left(\log\frac{S_t}{K}+r(T-t)\right)\frac{1}{\sqrt{T-t}}, \qquad b = \frac{\sigma}{2}\sqrt{T-t} \tag{35}$$

and

$$N(a,b) = \frac{1}{\sqrt{2\pi}}\int_{-1}^{\infty}dy\,e^{-\frac{y^2}{2}(a+b)^2} \tag{36}$$
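
The parametrization in terms of $a$ and $b$ is the usual Black-Scholes formula in disguise: for $a + b > 0$ a change of variable gives $(a+b)N(a,b) = \Phi(a+b)$, the standard normal distribution function. The sketch below checks this numerically. The function names are hypothetical and the midpoint rule with a truncated upper limit is an arbitrary choice made for illustration.

```python
import math

def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def black_scholes_ab(s, strike, r, sigma, tau):
    """The quantities a and b, with tau = T - t."""
    a = (math.log(s / strike) + r * tau) / (sigma * math.sqrt(tau))
    b = 0.5 * sigma * math.sqrt(tau)
    return a, b

def black_scholes_call(s, strike, r, sigma, tau):
    a, b = black_scholes_ab(s, strike, r, sigma, tau)
    return s * normal_cdf(a + b) - strike * math.exp(-r * tau) * normal_cdf(a - b)

def n_function(a, b, steps=20000):
    """Midpoint-rule evaluation of N(a, b) for a + b != 0."""
    c = abs(a + b)
    upper = 40.0 / c                       # integrand is negligible beyond this point
    h = (upper + 1.0) / steps
    total = sum(math.exp(-0.5 * ((-1.0 + (i + 0.5) * h) * c) ** 2) for i in range(steps))
    return total * h / math.sqrt(2.0 * math.pi)
```

For instance, with `a, b = black_scholes_ab(100, 100, 0.05, 0.2, 1.0)` the product `(a + b) * n_function(a, b)` agrees with `normal_cdf(a + b)` to high accuracy.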

In a stochastic volatility model with fractional noise, instead of $V(S_t,\sigma_t,t)$,
it would be more correct to write $V(S_t,\sigma_{\le t},t)$ to emphasize the dependence
on the past. For simplicity we have used the first notation, with the provision
that at no point, in the calculation below, Markov properties of the processes
should be assumed, only their Gaussian nature.

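To compute the conditional probability $p\left(\overline{\log\sigma}\,|\log\sigma_t\right)$ it follows from
(17) that the process $\overline{\log\sigma}$ conditioned to $\log\sigma_t$ at $t$ is

$$\overline{\log\sigma} = \log\sigma_t + \frac{k}{\delta(T-t)}\int_t^T ds\int_t^s\left(dB_H(\tau)-dB_H(\tau-\delta)\right) \tag{37}$$

Notice that, because we want to compute the conditional probability of $\overline{\log\sigma}$
given $\log\sigma_t$ at time $t$, $\sigma_t$ in Eq. (37) is not a process but simply the value of
the argument in the $V(S_t,\sigma_t,t)$ function.

As a $t$-dependent process the double integral in (37) is a centered Gaussian
process. Therefore, given $\log\sigma_t$ at time $t$, $\overline{\log\sigma}$ is a Gaussian variable
with conditional mean and variance

$$E\left\{\overline{\log\sigma}\,|\log\sigma_t\right\} = \log\sigma_t \tag{38}$$

$$\alpha^2 = E\left\{\left(\overline{\log\sigma}-\log\sigma_t\right)^2|\log\sigma_t\right\} = \frac{k^2}{\delta^2(T-t)^2}E\left\{\left(\int_t^T ds\int_t^s\left[dB_H(\tau)-dB_H(\tau-\delta)\right]\right)^2\right\} \tag{39}$$

Expanding $\int_t^s\left[dB_H(\tau)-dB_H(\tau-\delta)\right] = B_H(s)-B_H(t)-B_H(s-\delta)+B_H(t-\delta)$
and using (12) one obtains

$$\alpha^2 = \frac{k^2}{\delta^2(T-t)^2}\left\{\delta^{2H}(T-t)^2+I_1+I_2\right\} \tag{40}$$

with

$$\begin{aligned} I_1 &= \frac{1}{(2H+1)(2H+2)}\left\{(T-t+\delta)^{2H+2}+(T-t-\delta)^{2H+2}-2(T-t)^{2H+2}-2\delta^{2H+2}\right\} \\ I_2 &= \frac{T-t}{2H+1}\left\{2(T-t)^{2H+1}-(T-t+\delta)^{2H+1}-(T-t-\delta)^{2H+1}\right\} \end{aligned} \tag{41}$$

As is seen by expanding $I_1$ and $I_2$, when $t \to T$ one has the consistency
condition $\alpha^2 \to 0$. However, in general, for option pricing purposes, $\delta \ll (T-t)$
and one may approximate

$$\alpha^2 \approx k^2\delta^{2H-2}\left\{1-(2H-1)\left(\frac{\delta}{T-t}\right)^{2-2H}\right\} \tag{42}$$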
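
The quality of this approximation can be checked numerically. The sketch below evaluates the variance of the time-averaged log-volatility by integrating the fractional-noise covariance exactly, under the assumption (ours, for illustration) that the double integral is expanded with the covariance of fractional Brownian motion and that powers of $T-t-\delta$ are read as $|T-t-\delta|$ with the appropriate sign when $T-t<\delta$. Function names are hypothetical.

```python
import math

def alpha2_exact(tau, k, hurst, delta=1.0):
    """Variance of the averaged log-volatility over a horizon tau = T - t."""
    p, q = 2.0 * hurst + 2.0, 2.0 * hurst + 1.0
    diff = tau - delta
    sign = 1.0 if diff >= 0 else -1.0
    # contribution of the double integral of the noise autocovariance
    i1 = ((tau + delta) ** p + abs(diff) ** p - 2 * tau ** p - 2 * delta ** p) / (q * p)
    # contribution of the cross terms with the initial value
    i2 = tau / q * (2 * tau ** q - (tau + delta) ** q - sign * abs(diff) ** q)
    return k ** 2 / (delta * tau) ** 2 * (delta ** (2 * hurst) * tau ** 2 + i1 + i2)

def alpha2_approx(tau, k, hurst, delta=1.0):
    """Leading behaviour for delta << tau."""
    return k ** 2 * delta ** (2 * hurst - 2) * (1 - (2 * hurst - 1) * (delta / tau) ** (2 - 2 * hurst))
```

For $H = 0.8$, $k = 1$, $\delta = 1$ and $T - t = 100$ the two expressions differ by less than $10^{-3}$, and the exact expression tends to zero as $T - t \to 0$.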



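Finally

$$p\left(\overline{\log\sigma}\,|\log\sigma_t\right) = \frac{1}{\sqrt{2\pi}\,\alpha}\exp\left\{-\frac{\left(\overline{\log\sigma}-\log\sigma_t\right)^2}{2\alpha^2}\right\} \tag{43}$$

and from (32)

$$V(S_t,\sigma_t,t) = \int_{-\infty}^{\infty}d\xi\,C\left(S_t,e^{\xi},t\right)p\left(\xi|\log\sigma_t\right) \tag{44}$$

one obtains

$$V(S_t,\sigma_t,t) = S_t\left[aM(\alpha,a,b)+bM(\alpha,b,a)\right]-Ke^{-r(T-t)}\left[aM(\alpha,a,-b)-bM(\alpha,-b,a)\right] \tag{45}$$

$$M(\alpha,a,b) = \frac{1}{2\pi\alpha}\int_{-1}^{\infty}dy\int_0^\infty dx\,e^{-\frac{\log^2 x}{2\alpha^2}}e^{-\frac{y^2}{2}\left(ax+\frac{b}{x}\right)^2} = \frac{1}{4\alpha}\sqrt{\frac{2}{\pi}}\int_0^\infty dx\,\frac{e^{-\frac{\log^2 x}{2\alpha^2}}}{ax+\frac{b}{x}}\,\mathrm{erfc}\left(-\frac{a}{\sqrt{2}}x-\frac{b}{\sqrt{2}\,x}\right) \tag{46}$$

as a new option price formula (erfc is the complementary error function and
$a$ and $b$ are defined in Eq. (35), with $\sigma$ replaced by $\sigma_t$).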

Eqs. (44) and (45) are mathematically equivalent. For computational
convenience (of the reader who might want to use our formula) we point out
that, instead of writing dedicated code for the M-functions in Eq. (45),
one might simply use a Black-Scholes code and perform the integration in
Eq. (44).
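
One way to carry out this suggestion is sketched below: the Black-Scholes price is averaged over a Gaussian-distributed log-volatility with mean $\log\sigma_t$ and standard deviation $\alpha$. The use of Gauss-Hermite quadrature, the number of nodes and the function names are assumptions made for illustration; $\alpha$ is taken as an input.

```python
import math
import numpy as np

def black_scholes_call(s, strike, r, sigma, tau):
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    cdf = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))
    return s * cdf(d1) - strike * math.exp(-r * tau) * cdf(d2)

def fractional_vol_call(s, strike, r, sigma_t, tau, alpha, nodes=64):
    """Average of the Black-Scholes price over xi ~ N(log sigma_t, alpha^2)."""
    x, w = np.polynomial.hermite.hermgauss(nodes)    # nodes/weights for exp(-x^2)
    xi = math.log(sigma_t) + math.sqrt(2.0) * alpha * x
    prices = [black_scholes_call(s, strike, r, math.exp(v), tau) for v in xi]
    return float(np.dot(w, prices) / math.sqrt(math.pi))
```

Setting `alpha = 0` recovers the Black-Scholes price, while for an at-the-money option with small $\sigma_t\sqrt{T-t}$ the averaged price lies above it.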




Figure 7: Option price and equivalent implied volatility in the "risk-neutral" 
approach to the stochastic volatility model 

In Fig. 7 we plot the option value surface for $V(S_t,\sigma_t,t)$ in the range
$T-t \in [5, 100]$ and $S/K \in [0.5, 1.5]$ as well as the difference $\left(V(S_t,\sigma_t,t)-C(S_t,\sigma_t,t)\right)/K$
for $k = 1$ and $k = 2$. The other parameters are fixed at $\sigma = 0.01$, $r = 0.001$,
$\delta = 1$, $H = 0.8$.

To compare the predictions of our formula with the classical Black-Scholes
(BS) result, we have computed the implied volatility required in the BS model
to reproduce our results. This is plotted in the lower right panel of Fig. 7
which shows the implied volatility surface corresponding to $V(S_t,\sigma_t,t)$ for
$k = 1$. One sees that, when compared to BS, it predicts a smile effect with
the smile increasing as maturity approaches.
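
The implied volatility of a given option price can be obtained by inverting the Black-Scholes formula numerically. The following sketch uses bisection, which relies only on the call price being increasing in the volatility; the bracketing interval, tolerance and function names are arbitrary illustrative choices, not those used for Fig. 7.

```python
import math

def black_scholes_call(s, strike, r, sigma, tau):
    d1 = (math.log(s / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    cdf = lambda x: 0.5 * math.erfc(-x / math.sqrt(2.0))
    return s * cdf(d1) - strike * math.exp(-r * tau) * cdf(d2)

def implied_volatility(price, s, strike, r, tau, lo=1e-8, hi=5.0, tol=1e-10):
    """Bisection on sigma so that the Black-Scholes call matches `price`."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black_scholes_call(s, strike, r, mid, tau) < price:
            lo = mid      # price too low: volatility must be larger
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

Feeding this routine with model prices over a grid of $S/K$ and $T-t$ would give an implied volatility surface of the kind described above.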

5 Conclusions 

(a) In this paper, rather than starting by postulating some model for the
market process and then exploring how well or badly it is vindicated by the
data, the approach has been to be guided, at each step of the construction,
both by mathematical simplicity and consistency with the data. The resulting
model is mathematically more complex and requires (for example for a derivation of
option pricing without assuming risk-neutrality) more sophisticated tools of
Malliavin calculus than most stochastic volatility models. Nevertheless, from
its very construction and consistency with the data, it appears as a kind of
minimal model.

(b) The asymptotic behavior of price returns has been much discussed
(see for example [20] and references therein).
In particular it has been proposed that the large return tail decays as a power
law, although a stretched exponential might provide a better fit [21].

The semilogarithmic plots suggest in fact that a better overall fit might
be obtained by a stretched exponential or indeed by Eqs. (27) and

(c) From the comparison of data and model plotted in the figures it appears
that the stochastic volatility model (as well as a scaling hypothesis) cannot
fit the very large deviations. There is a good fit for the bulk of the data but
there are also a few events very far from the fit. It suggests that a model
with two probability spaces is still not enough to capture the whole process.
Maybe one should write $S_t(\omega,\omega',\omega'')$ with the last entry, $\omega''$, representing
exogenous market shocks.